Abstract. We show that every subset of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ grows rapidly when it acts on
itself by the group operation. It follows readily that, for every set of generators $A$
of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$, every element of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ can be expressed as a product of at most
$O((\log p)^c)$ elements of $A \cup A^{-1}$, where $c$ and the implied constant are absolute.



1. Introduction 

1.1. Background. Let $G$ be a finite group. Let $A \subset G$ be a set of generators of $G$. By
definition, every $g \in G$ can be expressed as a product of elements of $A \cup A^{-1}$. We would like
to know the length of the longest product that might be needed; in other words, we wish
to bound from above the diameter $\mathrm{diam}(\Gamma(G,A))$ of the Cayley graph of $G$ with respect
to $A$. (The Cayley graph $\Gamma(G,A)$ is the graph $(V,E)$ with vertex set $V = G$ and edge set
$E = \{(ag,g) : g \in G, a \in A\}$. The diameter of a graph $X = (V,E)$ is $\max_{v_1,v_2 \in V} d(v_1,v_2)$,
where $d(v_1,v_2)$ is the length of the shortest path between $v_1$ and $v_2$ in $X$.)
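
To make the definition concrete, here is a minimal Python sketch that computes the diameter of a Cayley graph of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ by breadth-first search for a small prime $p$. The encoding of matrices as 4-tuples and the function names are illustrative assumptions, not part of the paper. The search steps along $A \cup A^{-1}$, and it relies on the fact that a Cayley graph is vertex-transitive, so that the eccentricity of the identity equals the diameter.

```python
from collections import deque

def mat_mul(x, y, p):
    """Multiply 2x2 matrices stored as tuples (a, b, c, d), entries mod p."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def mat_inv(x, p):
    """Inverse of a determinant-1 matrix mod p."""
    a, b, c, d = x
    return (d % p, -b % p, -c % p, a % p)

def cayley_diameter(gens, p):
    """Return (size of the generated group, diameter of its Cayley graph)."""
    steps = set(gens) | {mat_inv(g, p) for g in gens}
    identity = (1, 0, 0, 1)
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        for a in steps:
            h = mat_mul(a, g, p)  # edge (ag, g)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    # Vertex-transitivity: the largest distance from the identity is the diameter.
    return len(dist), max(dist.values())

if __name__ == "__main__":
    size, diam = cayley_diameter([(1, 1, 0, 1), (1, 0, 1, 1)], 5)
    print(size, diam)
```

Brute force of this kind is only feasible for very small $p$, since the group has $p(p^2-1)$ elements; the point of the paper is a bound that does not require enumeration.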

If $G$ is abelian, the diameter can be very large: if $G$ is cyclic of order $2n + 1$, and $g$ is
any generator of $G$, then $g^n$ cannot be expressed as a product of length less than $n$ on the
elements of $\{g, g^{-1}\}$. However, if $G$ is non-abelian and simple, the diameter is believed to
be quite small:

Conjecture (Babai, [BS]). For every non-abelian finite simple group $G$,

(1.1) $\mathrm{diam}(\Gamma(G,A)) \ll (\log|G|)^c$,

where $c$ is some absolute constant and $|G|$ is the number of elements of $G$.

This conjecture is far from being proved. Even for the basic cases, viz., $G = A_n$ and
$G = \mathrm{PSL}_2(\mathbb{Z}/p\mathbb{Z})$, the conjecture has remained open until now; these two choices of $G$
seem to present already many of the main difficulties of the general case.

Work on both kinds of groups long predates the general conjecture in [BS]. Let us
focus on $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. There are some classical results for certain specific generators.
Let

(1.2) $A = \left\{ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \right\}$.

Selberg's spectral-gap theorem for $\mathrm{SL}_2(\mathbb{Z}) \backslash \mathbb{H}$ ([Se]) implies that $\{\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)\}_{p \ge 5}$ is
a family of expander graphs (vd., e.g., [Lu], Thm. 4.4.2, (i)). It follows easily that

$\mathrm{diam}(\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)) \ll \log p$.

1 While $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ is not simple, the statement (1.1) for $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ is trivially equivalent to (1.1) for
$\mathrm{PSL}_2(\mathbb{Z}/p\mathbb{Z})$, and treating the former group is both slightly more conventional and notationally simpler.

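Unfortunately, this argument works only for a few other choices of $A$. For example, no
good bounds were known up to now for $\mathrm{diam}(\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A))$ with, say,

(1.3) $A = \left\{ \begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix} \right\}$,

let alone for general $A$, uniformly on $A$ or not.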

1.2. Results. We prove the conjecture for $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$.

Main Theorem. Let $p$ be a prime. Let $A$ be a set of generators of $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Then
the Cayley graph $\Gamma(G,A)$ has diameter $O((\log p)^c)$, where $c$ and the implied constant are
absolute.

The theorem is a direct consequence of the following statement. 

Key Proposition. Let $p$ be a prime. Let $A$ be a subset of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ not contained in
any proper subgroup.

(a) Assume that $|A| < p^{3-\delta}$ for some fixed $\delta > 0$. Then

(1.4) $|A \cdot A \cdot A| > c|A|^{1+\epsilon}$,

where $c > 0$ and $\epsilon > 0$ depend only on $\delta$.

(b) Assume that $|A| > p^{\delta}$ for some fixed $\delta > 0$. Then there is an integer $k > 0$,
depending only on $\delta$, such that every element of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ can be expressed as a
product of at most $k$ elements of $A \cup A^{-1}$.

The crucial fact here is that the constants $c$, $\epsilon$ and $k$ do not depend on $p$ or on $A$.

It follows immediately from the main theorem (via [DSC], §2, Lem. 2, §3, Cor. 3.1, and
§3, Cor. 3.2) that the mixing time of $\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)$ is $O(|A|(\log p)^{2c+1})$, where
the implied constant is absolute and $c$ is as in the main theorem. (The mixing time is
the least $t$ for which a lazy random walk of length $t$ starting at the origin of the Cayley
graph has a distribution of destinations close to the uniform distribution in the $\ell_1$ norm;
vd. §6.)

If $A$ equals the projection of a fixed set of generators of a free group in $\mathrm{SL}_2(\mathbb{Z})$ (take,
e.g., $A$ as in (1.2) or (1.3)), it follows by a simple argument that $A$ must grow rapidly at
first when multiplied by itself. In such a situation, we obtain a bound of

$\mathrm{diam}(\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)) \ll \log p$,

where the implied constant depends on the elements of $\mathrm{SL}_2(\mathbb{Z})$ of which $A$ is a projection.
For (1.3) and most other examples, this bound is new; for $A$ as in (1.2), it is, of course,
known, and the novelty lies in the proof.

What is given here is not, however, the first elementary proof for the choice of $A$ in (1.2); see [SX]. The
proof in [SX] works for all projections of sets generating finite-index subgroups of $\mathrm{SL}_2(\mathbb{Z})$. Gamburd [Ga]
succeeded in extending the method to projections of sets generating subgroups of $\mathrm{SL}_2(\mathbb{Z})$ whose limit sets
have Hausdorff dimension greater than 5/6.

If $A$ is a random pair of generators, then, with probability tending to 1 as $p \to \infty$, the
graph $\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)$ does not have small loops (see §6). It then follows from the key
proposition that $\mathrm{diam}(\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)) \ll \log p$, as ventured by Lubotzky ([Lu], Prob.
10.3.3). The implied constant is absolute.

1.3. Techniques. The tools used are almost exclusively additive-combinatorial. Fourier
analysis over finite fields and Ruzsa distances are used repeatedly. Both Gowers's effective
version of the Balog-Szemerédi theorem ([Go1]) and the sum-product estimates in [BKT]
and [Ko] play crucial roles. It is only through [Ko] that arithmetic strictly speaking
plays a role, viz., in the guise of an estimate proved in [HBK] with techniques derived
from Stepanov's elementary proof of the Weil bounds. The Weil bounds themselves are
not used, and even the use of [Ko] becomes unnecessary when auxiliary results suffice to
ensure the growth of $A$ small (namely, in the cases of fixed or random generators).

Estimates on growth in $\mathbb{Z}/p\mathbb{Z}$ will be proved in §2, and part (a) of the key proposition
will be reduced thereto in §3. Given part (a), it suffices to prove (b) for very large $A$ -
and this is a relatively simple task (§3), yielding to the use of growth estimates coming
from Fourier analysis.

1.4. Work to do. A natural next step would be to generalise the main results to the
group $\mathrm{SL}_2(\mathbb{F}_{p^\alpha})$, $\alpha > 1$. At first sight, this does not seem too hard; however, there seem
to be actual difficulties in making the result uniform on $\alpha$.

A generalisation to $\mathrm{SL}_n(\mathbb{Z}/p\mathbb{Z})$ for $n \ge 3$ is likely to require a great deal of original
work. The arguments in §4.1-4.3 should carry over, but those in §2 and §4.4 do not. It is
possible that the basic approach in §4.1-4.3 will eventually prove itself valid for all simple
groups of Lie type, but it is too soon to tell whether something will be found to replace
§2 and §4.4 in a general context.

No attempt has been made to optimize - or compute - the constant c in the main 
theorem, though, like the implied constant, it is effective and can be made explicit. Actual 
numerical constants will sometimes be used in the argument for the sake of notational 
clarity. 

1.5. Further remarks. There is a rich literature on the growth of sets in linear algebraic 
groups over fields of characteristic zero: see, most recently, [EMO]. In such a situation,
one has access to topological arguments without clear analogues in Z/pZ. It is possible, 
nevertheless, to adapt the vocabulary of growth on infinite groups to the finite case. 
For example, one can say the key proposition implies immediately that A does not have 
moderate growth ([DSC2]). 

The problem of bounding the diameter of $\Gamma(\mathrm{SL}_2(\mathbb{Z}/p^k\mathbb{Z}), A)$ for $p$ fixed and $k$ variable
is fundamentally different from that of bounding the diameter of $\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)$ for $p$
variable. From a $p$-adic perspective, the problem for $\mathrm{SL}_2(\mathbb{Z}/p^k\mathbb{Z})$ is analogous to that for
SU(2), which was treated by Solovay and Kitaev [NC]. Dinai [Di] has succeeded in giving
a polylogarithmic bound for $\mathrm{diam}(\Gamma(\mathrm{SL}_2(\mathbb{Z}/p^k\mathbb{Z}), A))$, $p$ fixed, in part by adapting Solovay
and Kitaev's procedure.

The diameter of a Cayley graph $\Gamma(G, A)$ of a solvable linear algebraic group $G$ can be large: for
example, $G$ could be generated by the set $A$ of all elements of $G$ all of whose eigenvalues lie in $B$, where
$B \subset (\mathbb{F}_{p^\alpha})^*$ is a set that grows very slowly when multiplied by itself. By the Lie-Kolchin theorem, the
eigenvalues of $A \cdot A \cdots A$ will lie in $B \cdot B \cdots B$, which, by assumption, is only slightly larger than $B$. (See
also [ET].) It is unclear whether the present paper's approach will be directly applicable to groups that
are neither solvable nor simple (nor almost simple).

Consider the family $\mathcal{F} = \{\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), A)\}_{p,A}$, where both $p$ and $A$ vary: $p$ ranges
across the primes and $A$ ranges across all sets that generate $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. If we could prove
that $\mathcal{F}$ is an expander family, we would obtain the main theorem with the constant $c$
set to 1. We are still far from proving that $\mathcal{F}$ is an expander family, and we will not,
of course, assume such a hypothesis; rather, we will obtain a weaker statement as an
immediate consequence of the main theorem (Cor. 6.1). It seems unjustified for now to
hope for a purely combinatorial proof that a family of Cayley graphs $\{\Gamma(G,A)\}$ where
both $G$ and $A$ vary quite freely is an expander family: we would need, not estimates on
the growth of a set $A$ when added to or multiplied by itself, but, instead, estimates on
the growth of a set $A$ under the action of addition or multiplication by a small, fixed set
$S$, or under the action of a small set of operations. (Here "small" means "of cardinality
less than a constant".) Such estimates are outside of the reach of the already remarkably
strong sum-product techniques of [BKT] and [Ko].

1.6. Acknowledgments. I would like to thank A. Venkatesh for having first called the 
problem to my attention and for shedding light spontaneously. His Clay Mathematics 
Institute grant paid for a trip during which the present subject and many other interesting 
things were discussed. I was otherwise funded by the Centre de Recherches Mathematiques 
and the Institut de Sciences Mathematiques (Montreal). 

Thanks are also due to N. Anantharaman, E. Breuillard, O. Dinai, U. Hadad, C. Hall 
and G. Harcos, for their careful reading and several helpful comments, to A. Gamburd, 
A. Lubotzky and I. Pak, for their instructive remarks and references, and to A. Granville, 
for his encouragement and advice, and for access to an unpublished set of lecture notes. 

2. Background and preliminaries 

2.1. General notation. As is customary, we denote by $\mathbb{F}_{p^\alpha}$ the finite field of order $p^\alpha$.
We write $|f|_r$ for the $L_r$-norm of a function $f$. Given a set $A$, we denote its cardinality
by $|A|$, and its characteristic function by $A$ itself. Thus, $|A| = |A|_1$. By $A + B$ (resp.
$A \cdot B$), we shall always mean $\{x + y : x \in A, y \in B\}$ (resp. $\{x \cdot y : x \in A, y \in B\}$), or the
characteristic function thereof; cf. $(A * B)(x) = |\{(y,z) \in A \times B : y + z = x\}|$. By $A + \xi$
and $\xi \cdot A$ we mean $\{x + \xi : x \in A\}$ and $\{\xi \cdot x : x \in A\}$, respectively.

For us, $A^r$ means $\{x^r : x \in A\}$; in general, if $f$ is a function on $A$, we take $f(A)$ to
mean $\{f(x) : x \in A\}$. Given a positive integer $r$ and a subset $A$ of a group $G$, we define
$A_r$ to be the set of all products of at most $r$ elements of $A \cup A^{-1}$:

$A_r = \{g_1 \cdot g_2 \cdots g_r : g_i \in A \cup A^{-1} \cup \{1\}\}$.

Finally, we write $\langle A \rangle$ for the group generated by $A$.
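
The set $A_r$ just defined can be computed directly for small examples. The following Python sketch does so for subsets of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. The 4-tuple encoding of matrices and the function names are assumptions made for illustration; the elements of `A` are assumed to be reduced mod `p` and to have determinant 1.

```python
def mat_mul(x, y, p):
    """Multiply 2x2 matrices stored as tuples (a, b, c, d), entries mod p."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def mat_inv(x, p):
    """Inverse of a determinant-1 matrix mod p."""
    a, b, c, d = x
    return (d % p, -b % p, -c % p, a % p)

def ball(A, r, p):
    """A_r: all products g_1 ... g_r with each g_i in A, A^{-1} or {1}."""
    identity = (1, 0, 0, 1)
    steps = set(A) | {mat_inv(g, p) for g in A} | {identity}
    current = {identity}
    for _ in range(r):
        # Allowing the identity as a factor gives products of AT MOST r elements.
        current = {mat_mul(x, s, p) for x in current for s in steps}
    return current

if __name__ == "__main__":
    A = [(1, 1, 0, 1), (1, 0, 1, 1)]
    for r in range(1, 5):
        print(r, len(ball(A, r, 5)))
```

Since the identity is always an admissible factor, the sets $A_1 \subset A_2 \subset \cdots$ are nested, and they stabilise at $\langle A \rangle$.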

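2.2. Fourier analysis over $\mathbb{Z}/p\mathbb{Z}$. We will review some basic facts, in part to fix our
normalizations. The Fourier transform $\hat{f}$ of a function $f : \mathbb{Z}/p\mathbb{Z} \to \mathbb{C}$ is given by

$\hat{f}(y) = \sum_{x \in \mathbb{Z}/p\mathbb{Z}} f(x) e^{-2\pi i x y/p}$.

The Fourier transform is an isometry:

$\sum_{x \in \mathbb{Z}/p\mathbb{Z}} |\hat{f}(x)|^2 = p \cdot \sum_{x \in \mathbb{Z}/p\mathbb{Z}} |f(x)|^2$.

For any $f, g : \mathbb{Z}/p\mathbb{Z} \to \mathbb{C}$, we have $\widehat{f * g} = \hat{f} \cdot \hat{g}$. If $A, B \subset \mathbb{Z}/p\mathbb{Z}$, then $|A * B|_1 = |A||B|$.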
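
The three facts above (the isometry with its factor of $p$, the convolution identity, and the $L_1$-norm of $A * B$) can be checked numerically. The sketch below uses only the standard library; the helper names are illustrative, and the transform follows the normalization fixed above.

```python
import cmath

def fourier(f, p):
    """Unnormalized transform: f_hat(y) = sum_x f(x) exp(-2 pi i x y / p)."""
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * x * y / p) for x in range(p))
            for y in range(p)]

def convolve(f, g, p):
    """(f * g)(x) = sum_y f(y) g(x - y), indices taken mod p."""
    return [sum(f[y] * g[(x - y) % p] for y in range(p)) for x in range(p)]

def indicator(A, p):
    """Characteristic function of A as a list of length p."""
    return [1 if x in A else 0 for x in range(p)]

if __name__ == "__main__":
    p = 7
    A, B = {0, 1, 3}, {2, 5}
    f, g = indicator(A, p), indicator(B, p)
    lhs = sum(abs(v) ** 2 for v in fourier(f, p))
    print(round(lhs, 9), p * sum(v * v for v in f))   # isometry, up to rounding
    print(sum(convolve(f, g, p)), len(A) * len(B))    # |A * B|_1 = |A||B|
```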

2.3. Additive combinatorics, abelian and non-abelian. Some basic concepts and 
proofs of additive combinatorics transfer effortlessly to the non-abelian case; some do not. 
In the following, G need not be an abelian group, except, of course, when it is explicitly 
said to be one. 

Definition 1. Let $A$ and $B$ be finite subsets of a group $G$. We define the Ruzsa distance

$d(A,B) = \log \frac{|AB^{-1}|}{\sqrt{|A||B|}}$.

If $G$ is an abelian group whose operation is written additively, we denote the Ruzsa
distance by $d_+(A,B)$.

The Ruzsa distance, while not truly a distance function ($d(A,A) \ne 0$ in general), does
satisfy the triangle inequality.

Lemma 2.1. Let $A$, $B$ and $C$ be finite subsets of a group $G$. Then

(2.1) $d(A,C) \le d(A,B) + d(B,C)$.
Proof (Ruzsa). It is enough to prove that

(2.2) $|AC^{-1}||B| \le |AB^{-1}||BC^{-1}|$.

We will do as much by constructing an injection $\iota : AC^{-1} \times B \to AB^{-1} \times BC^{-1}$. For every
$d \in AC^{-1}$, choose once and for all a pair $(a_d, c_d) \in A \times C$ such that $d = a_d c_d^{-1}$. Define
$\iota(d,b) = (a_d b^{-1}, b c_d^{-1})$. We can recover $d = a_d c_d^{-1}$ from $\iota(d,b)$; since $(a_d, c_d)$ depends only
on $d$, we recover $(a_d, c_d)$ thereby. From $\iota(d,b)$ and $(a_d, c_d)$ we can tell $b$. Thus, $\iota$ is an
injection. □
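
Inequality (2.2) holds in any group, abelian or not, and this is easy to test on a small non-abelian example. The sketch below samples subsets of the symmetric group on four letters, with permutations encoded as tuples; the choice of group, the encoding and the function names are assumptions made for illustration only.

```python
import itertools
import random

def compose(s, t):
    """Composition of permutations: (s o t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)

def inverse(s):
    """Inverse permutation."""
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v] = i
    return tuple(inv)

def quotient_set(A, B):
    """The set A B^{-1} = {a b^{-1} : a in A, b in B}."""
    return {compose(a, inverse(b)) for a in A for b in B}

def ruzsa_triangle_holds(A, B, C):
    """Check |A C^{-1}| |B| <= |A B^{-1}| |B C^{-1}|, i.e. inequality (2.2)."""
    lhs = len(quotient_set(A, C)) * len(B)
    rhs = len(quotient_set(A, B)) * len(quotient_set(B, C))
    return lhs <= rhs

if __name__ == "__main__":
    G = list(itertools.permutations(range(4)))
    rng = random.Random(0)
    A, B, C = (rng.sample(G, 5) for _ in range(3))
    print(ruzsa_triangle_holds(A, B, C))
```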

In particular, we have

(2.3) $d(A,A) \le d(A,A^{-1}) + d(A^{-1},A) = 2d(A,A^{-1})$.
If $G$ is abelian, then, by [Ru2], Thm. 2,

(2.4) $d(A,A^{-1}) \le 3d(A,A)$.

This need not hold if $G$ is not abelian: if $A$ is a coset $gH$ of a large non-normal subgroup
$H \subset G$, we have $|AA^{-1}| = |H| = |A|$, but $|AA| = |HgH|$ may be much larger than $|A|$,
and thus $d(A,A^{-1})$ is unbounded while $d(A,A) = 0$.

Another peculiarity of the abelian case is that, if $A \cdot A \cdot A$ is large, then $A \cdot A$ must be
large. If $G$ is not abelian, and $A$ is of the form $H \cup \{g\}$, where $H$ is a large subgroup of
$G$, then $|A \cdot A| < 3|H| + 1 < 3|A|$, while $A \cdot A \cdot A$ contains $HgH$, and thus may be very
large. However, the following auxiliary result does hold even for $G$ non-abelian.
Lemma 2.2. Let $n > 2$ be an integer. Let $A$ be a finite subset of a group $G$. Suppose that

$|A_n| > c|A|^{1+\epsilon}$

for some $c > 0$, $\epsilon > 0$. Then

$|A \cdot A \cdot A| > c'|A|^{1+\epsilon'}$,
where $c' > 0$, $\epsilon' > 0$ depend only on $c$, $\epsilon$ and $n$.
Proof. By (2.2),

$\frac{|A_n|}{|A|} = \frac{|A_{n-2} A_2|}{|A|} \le \frac{|A_{n-2} A^{-1}|}{|A|} \cdot \frac{|A \cdot A_2|}{|A|} \le \frac{|A_{n-1}|}{|A|} \cdot \frac{|A_3|}{|A|}$.

Proceeding by induction on $n$, we obtain that

$\frac{|A_n|}{|A|} \le \left( \frac{|A_3|}{|A|} \right)^{n-2}$.

It remains to bound $|A_3|/|A|$ from above by a power of $|A \cdot A \cdot A|/|A|$. Again by (2.2),

(2.5) $|AAA^{-1}||A| = |AAA^{-1}||A^{-1}| \le |AAA||A^{-1}A^{-1}| \le |AAA|^2$,
      $|AA^{-1}A||A| \le |AA^{-1}A^{-1}||AA| = |AAA^{-1}||AA| \le |AAA^{-1}||AAA|$.

Bound $|AA^{-1}A^{-1}|$, $|A^{-1}AA|$, ..., $|A^{-1}A^{-1}A^{-1}|$ in terms of $|AAA|$ and $|A|$ by reducing
them to either case of (2.5): take inverses and replace $A$ by $A^{-1}$ as needed. □

2.4. Regularity. The following is a special case of the Gowers-Balog-Szemeredi theorem. 

Theorem 2.3. Let $A$ be a finite subset of an additive abelian group. Let $S$ be a subset of
$A \times A$ with cardinality $|S| \ge |A|^2/K$. Suppose we have the bound

$|\{a + b : (a,b) \in S\}| \le K|A|$.

Then there is a subset $A'$ of $A$ such that $|A'| \ge cK^{-C}|A|$ and

$|A' + A'| \le CK^{C}|A|$,

where $c > 0$ and $C > 0$ are absolute.

Proof. By [Go1], Prop. 12, with $B = A$, there are sets $A', B' \subset A$ such that $|A'|, |B'| \ge
cK^{-C}|A|$ and $|A' - B'| \le CK^{C}|A|$. By the pigeonhole principle, there is a $z$ such that
$a - b = z$ for at least $C^{-1}c^2K^{-3C}|A|$ pairs $(a,b) \in A' \times B'$. Thus, $|V| \ge C^{-1}c^2K^{-3C}|A|$,
where we define $V = A' \cap (B' + z)$. At the same time, $V - V \subset (A' - B') - z$, and so
$|V - V| \le CK^{C}|A|$. By (2.4), $d_+(V,-V) \le 3d_+(V,V)$, and so $|V + V| \le (C^6/c^6)K^{12C}|V|$. We
redefine $A'$ to be $V$ and are done. □



2.5. Sum-product estimates in finite fields.

2.5.1. Estimates for small sets. It is a simple matter to generalize the main result in [Ko]
to finite fields other than $\mathbb{F}_p$.

Theorem 2.4. Let $q = p^\alpha$ be a prime power. Let $\delta > 0$ be given. Then, for any $A \subset \mathbb{F}_q^*$
with $C < |A| < p^{1-\delta}$, we have

$\max(|A \cdot A|, |A + A|) > |A|^{1+\epsilon}$,

where $C > 0$ and $\epsilon > 0$ depend only on $\delta$.

Explicit values of $C$ and $\epsilon$ can be computed for any given $\delta > 0$.

Proof. The proofs of [HBK], Lem. 5, [Ko], Lem. 5, and [Ko], Thm. 2, work for any finite
field $\mathbb{F}_q$ without any changes. (In the statements of [Ko], Lem. 5 and Thm. 3, the conditions
$|A| < \sqrt{|F|}$ and $|B| < \sqrt{|F|}$ need to be replaced by $|A| < \sqrt{p}$ and $|B| < \sqrt{p}$.) For the
range $|A| \ge p^{1/2}$, use [BKT], Thm. 4.3. □

Note the condition $|A| < p^{1-\delta}$ in Thm. 2.4, where one might expect $|A| < q^{1-\delta}$. A
subset $A$ of $\mathbb{F}_q^*$ may be of size about $p$ and fail to grow larger under multiplication by
itself: take, for instance, $A = (\mathbb{F}_p)^*$, viewed as a subset of $\mathbb{F}_q^*$. One can prove a version of
Thm. 2.4 in the range $p < |A| < q^{1-\delta}$ (see [BKT], Thm. 4.3), but we will not need to
work in such a range. Hence also the condition $|A| < p^{1-\delta}$ in Prop. 3.1 and Prop. 3.3.
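
The sum-product phenomenon behind Thm. 2.4 is easy to observe on the two extreme examples, an arithmetic and a geometric progression in $\mathbb{F}_p$: each has one small set among $A + A$, $A \cdot A$, but never both. The following Python sketch is only an illustration on hand-picked sets; the prime 101 and the function names are assumptions, and the sketch does not compute the constants $C$ and $\epsilon$ of the theorem.

```python
def sumset(A, p):
    """A + A in Z/pZ."""
    return {(x + y) % p for x in A for y in A}

def productset(A, p):
    """A . A in Z/pZ."""
    return {(x * y) % p for x in A for y in A}

if __name__ == "__main__":
    p = 101
    arithmetic = set(range(1, 11))                # 1, 2, ..., 10
    geometric = {pow(2, i, p) for i in range(10)}  # 1, 2, 4, ..., 2^9 mod p
    for name, A in (("arithmetic", arithmetic), ("geometric", geometric)):
        print(name, len(A), len(sumset(A, p)), len(productset(A, p)))
```

For the arithmetic progression the sumset is as small as possible while the product set is much larger; for the geometric progression the roles are reversed.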

2.5.2. Estimates for large sets. 

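Lemma 2.5. Let $p$ be a prime, $A$ a subset of $\mathbb{F}_p$, $S$ a subset of $\mathbb{F}_p^*$. Then there is an
element $\xi \in S$ such that

$|A + \xi A| \ge \left( \frac{1}{p} + \frac{p}{|S||A|^2} \right)^{-1} \ge \frac{1}{2} \min\left( p, \frac{|S||A|^2}{p} \right)$.

Furthermore, for every $c \in (0,1]$, there are at least $(1-c)|S|$ elements $\xi \in S$ such that

$|A + \xi A| \ge c \left( \frac{1}{p} + \frac{p}{|S||A|^2} \right)^{-1}$.

Cf. [Ko], Lem. 2, which is stronger when $|A| < p^{1/2}$.

Proof. Let us take Fourier transforms and proceed as in the beginning of the proof of
Thm. 6 in [BGK]:

$\sum_{\xi \in S} |A * \xi A|_2^2 = \frac{1}{p} \sum_{\xi \in S} |\hat{A} \cdot \widehat{\xi A}|_2^2 = \frac{1}{p} \sum_{\xi \in S} \sum_{x \in \mathbb{F}_p} |\hat{A}(x)|^2 |\hat{A}(\xi x)|^2$

$\le \frac{1}{p} \left( |S||\hat{A}(0)|^4 + \sum_{x \in \mathbb{F}_p^*} \sum_{y \in \mathbb{F}_p^*} |\hat{A}(x)\hat{A}(y)|^2 \right) \le \frac{1}{p} \left( |S||A|^4 + \left( \sum_{x \in \mathbb{F}_p} |\hat{A}(x)|^2 \right)^2 \right)$

$= \frac{1}{p} \left( |S||A|^4 + p^2 (|A|_2^2)^2 \right) = \frac{1}{p} \left( |S||A|^4 + p^2|A|^2 \right)$.

Hence, there is an element $\xi_0 \in S$ such that

$|A * \xi_0 A|_2^2 \le \frac{|A|^4}{p} + \frac{p|A|^2}{|S|}$,

and for every $c \in (0,1]$, there are at least $(1-c)|S|$ elements $\xi \in S$ such that

$|A * \xi A|_2^2 \le \frac{1}{c} \left( \frac{|A|^4}{p} + \frac{p|A|^2}{|S|} \right)$.

At the same time, by the Cauchy-Schwarz inequality,

$|A * \xi A|_1^2 \le |A + \xi A| \cdot |A * \xi A|_2^2$.

As $|A * \xi A|_1 = |A|^2$ for every $\xi \in \mathbb{F}_p^*$, we obtain that

$|A + \xi_0 A| \ge \frac{|A|^4}{|A|^4/p + p|A|^2/|S|} = \left( \frac{1}{p} + \frac{p}{|S||A|^2} \right)^{-1}$

for at least one $\xi_0 \in S$, and

$|A + \xi A| \ge c \left( \frac{1}{p} + \frac{p}{|S||A|^2} \right)^{-1}$

for at least $(1-c)|S|$ elements $\xi \in S$. □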
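
The statement of Lemma 2.5 can be tested by brute force on small random inputs. The sketch below computes $|A + \xi A|$ for every $\xi \in S$ and compares it with the bound $(1/p + p/(|S||A|^2))^{-1}$; the prime, the sizes of the sets, the seed and the function names are arbitrary illustrative choices, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
import random

def dilate_sum(A, xi, p):
    """The set A + xi*A = {a + xi*b : a, b in A} in Z/pZ."""
    return {(a + xi * b) % p for a in A for b in A}

def lemma_bound(p, size_A, size_S):
    """The quantity (1/p + p/(|S||A|^2))^{-1} as an exact fraction."""
    return 1 / (Fraction(1, p) + Fraction(p, size_S * size_A ** 2))

if __name__ == "__main__":
    p = 31
    rng = random.Random(1)
    A = rng.sample(range(p), 8)
    S = rng.sample(range(1, p), 10)
    bound = lemma_bound(p, len(A), len(S))
    sizes = {xi: len(dilate_sum(A, xi, p)) for xi in S}
    print(float(bound), max(sizes.values()))
```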



3. Expanding functions on $\mathbb{F}_q$

Let $f$ be a fairly unexceptional polynomial on $x$ and $y$ (or on $x$, $y$ and $x^{-1}$, $y^{-1}$). It
is natural to expect a result of the following type to hold: for every $\delta > 0$ and some $r$,
$\epsilon > 0$ and $C > 0$ depending only on $\delta$, every set $A \subset \mathbb{F}_p$ with $C < |A| < p^{1-\delta}$ must
fulfill $|f(A_r, A_r)| > |A|^{1+\epsilon}$. The work in [BKT] and [Ko] amounts to such a result for
$f(x,y) = x + y$. We will now see how to derive therefrom a result of the same type for
some other choices of $f(x,y)$.

Proposition 3.1. Let $q = p^\alpha$ be a prime power. Let $\delta > 0$ be given. Then, for any $A \subset \mathbb{F}_q^*$
with $C < |A| < p^{1-\delta}$, we have

$|\{(x + x^{-1}) \cdot (y + y^{-1}) : x, y \in A_2\}| > |A|^{1+\epsilon}$,

where $C > 0$ and $\epsilon > 0$ depend only on $\delta$.

Proof. Let $w(x) = x + x^{-1}$. Suppose $|\{w(x)w(y) : x, y \in A_2\}| \le |A|^{1+\epsilon}$. It follows
directly that $|A_2| \le 2|A|^{1+\epsilon}$. Since $w(x)w(y) = w(xy) + w(xy^{-1})$, and the cardinality
of $S = \{(w(xy), w(xy^{-1})) : x, y \in A\}$ is at least $|A|^2/16$, we may apply Thm. 2.3 and
obtain that there is an $A' \subset A_2$ (which may be taken to be closed under inversion) such
that $|A'| \ge c'|A|^{1-C'\epsilon}$ and $|w(A') + w(A')| \le C'|A|^{1+C'\epsilon}$. At the same time, we have
$|w(A')w(A')| \le |w(A_2)w(A_2)| \le |A|^{1+\epsilon}$. By Thm. 2.4 we have a contradiction, provided
that $\epsilon$ is small enough and $C$ is large enough. □

Lemma 3.2. Let $A$ and $B$ be subsets of a group $G$. Then $A$ can be covered by at most
$|A \cdot B|/|B|$ cosets $a_j B_2$ of $B_2$, where $a_j \in A$.

This is the non-commutative version of an argument of Ruzsa's ([Ru]).

Proof. Let $\{a_1, a_2, \ldots, a_k\}$ be a maximal subset of $A$ with the property that the cosets $a_j B$,
$1 \le j \le k$, are all disjoint. It is clear that $k \le |A \cdot B|/|B|$. Let $x \in A$. Since $\{a_1, a_2, \ldots, a_k\}$
is maximal, there is a $j$ such that $a_j B \cap xB$ is non-empty. Then $x \in a_j B B^{-1} \subset a_j B_2$.
Thus, the sets $a_j B_2$ cover $A$. □
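
The proof of Lemma 3.2 is effectively a greedy algorithm, and it can be run as such. The sketch below picks a maximal family of elements $a_j \in A$ with pairwise disjoint translates $a_j B$ and then checks that the sets $a_j B B^{-1}$ cover $A$. It works in the symmetric group on four letters with permutations as tuples; the group, the encoding and the function names are assumptions made for illustration.

```python
import itertools
import random

def compose(s, t):
    """Composition of permutations: (s o t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)

def inverse(s):
    """Inverse permutation."""
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v] = i
    return tuple(inv)

def greedy_centers(A, B):
    """Maximal family a_1, ..., a_k in A with the translates a_j B pairwise disjoint."""
    centers, used = [], set()
    for a in A:
        translate = {compose(a, b) for b in B}
        if used.isdisjoint(translate):
            centers.append(a)
            used |= translate
    return centers

def covers(A, B, centers):
    """Check that every x in A lies in some a_j B B^{-1}."""
    bb_inv = {compose(b, inverse(c)) for b in B for c in B}
    covered = {compose(a, u) for a in centers for u in bb_inv}
    return all(x in covered for x in A)

if __name__ == "__main__":
    G = list(itertools.permutations(range(4)))
    rng = random.Random(2)
    A, B = rng.sample(G, 12), rng.sample(G, 4)
    centers = greedy_centers(A, B)
    print(len(centers), covers(A, B, centers))
```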
Proposition 3.3. Let $q = p^\alpha$ be a prime power. Let $\delta > 0$ and $a_1, a_2 \in \mathbb{F}_q^*$ be given.
Then, for any $A \subset \mathbb{F}_q^*$ with $C < |A| < p^{1-\delta}$,

$|\{a_1(xy + x^{-1}y^{-1}) + a_2(x^{-1}y + xy^{-1}) : x, y \in A_{20}\}| > |A|^{1+\epsilon}$,
where $C > 0$ and $\epsilon > 0$ depend only on $\delta$.

Proof. By Lemma 13.21 we may cover with at most |^4 • A 2 |/|^4 2 | cosets aiA 2 ,..., a^A\ 
of where a,j G Aj. Given x, y G ^2 such that xy G ajA^, we know that xy^ 1 = 
(xy)y~ 2 G g^-A 2 . By Proposition 13.11 and the pigeonhole principle, there is an index j such 
that 

(3-1) |{(r + r' 1 ) + (s + s- 1 ) : r, s G aj A 2 }\ > ^^L— 

Since \A 4 ■ A 2 \/\A 2 \ < 2\A 6 \/\A\, we have either 2|A 6 | > \A\ 1+t / 4 or 

\A\ 1+ ' 



> \A\ 1+3£/4 . 



\A A -A 2 \/\A 2 \ 

In the former case, we are already done. So, let us assume 2|^4.g | < | 1 + e / 4 . 

Write B = a,jA\ C A 12 . Since \B\ < \A A \ < \A\ 1+e / 4 , inequality ([3TT]) implies that 



d + (w(B),-w(B)) > |logL4|. 



By (J23J, we obtain that 
Then, by the triangle inequality (|2.ip . 



d+(u;(B),«;(B)) > Jlog|A| 



d + (a l w{B) 1 -a 2 w(B)) > ±d+(w(B),w(B)) > ^\og\A\. 



In other words, 



(3.2) \{ ai (r + r- 1 ) + a 2 (s + s- 1 ) :r,s€B}\> \B\\A\ e / 12 > l\A\ 1+e ' n . 

For any r,s £ B, the ratio r/s is in A\A~ l 2 C j4|. Let y £ ^ be such that y 2 = r/s; 
define x = r/y £ A 2 q. Then r = xy and s = x/y. Therefore 

{a\(r + r~ l ) +a 2 (s + s _1 ) :r,s£6}c {ai(xy + xy _1 ) + a 2 (xy~ l + x _1 y) : x,y G ^20}- 
By (3.2), we are done. □



4. Traces and growth 

In §4.1 we will see how, if $A \subset \mathrm{SL}_2(\mathbb{F}_p)$ fails to grow, it must commute with itself to
a fair extent, so to speak. The arguments in §4.2 are familiar from the study of growth
in complex groups. The results in §4.3 will follow from those in §4.1 by means of simple
combinatorial arguments. We will be able to prove the main part of the key proposition
in §4.4 using the results in §3 and §4.1-4.3.






H. A. HELFGOTT 



4.1. Growth and commutativity. We will first see that, if a subset $A$ of any group
$G$ does not grow rapidly under multiplication by itself, there must be an element $g$ of $A$
with which many elements of $A$ commute. We shall then use the fact that, in a linear
algebraic group, two elements $h_1$, $h_2$ that commute with a given $g$ with distinct eigenvalues
$\lambda_{g,1}, \ldots, \lambda_{g,n}$ must also commute with each other. Since non-unipotent elements are easy
to produce in $\mathrm{SL}_2(K)$ (Lem. 4.2), we will conclude that every given subset $A$ of $\mathrm{SL}_2(K)$
either grows rapidly or contains a large simultaneously diagonalizable subset (Cor. 4.3).

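Proposition 4.1. Let $G$ be a group and $A$ a non-empty finite subset thereof. Let $\Lambda_A$ be
the set of conjugacy classes of $G$ with non-empty intersection with $A$. For $g \in G$, let $C_G(g)$
be the centralizer of $g$ in $G$. Then there is a $g \in A$ such that

$|C_G(g) \cap (A^{-1}A)| \ge \frac{|\Lambda_A||A|}{|AAA^{-1}|}$.

Proof. Let $g, h_1, h_2 \in A$. If $h_1 g h_1^{-1} = h_2 g h_2^{-1}$, then $h_2^{-1}h_1 \in A^{-1}A$ commutes with $g$.
Hence, for any $g \in G$,

$|\{hgh^{-1} : h \in A\}| \ge \frac{|A|}{|C_G(g) \cap A^{-1}A|}$.

Let $\Upsilon \subset A$ be a set of representatives of $\Lambda_A$. Then

$|AAA^{-1}| \ge |\{hgh^{-1} : h \in A, g \in \Upsilon\}| \ge \sum_{g \in \Upsilon} \frac{|A|}{|C_G(g) \cap A^{-1}A|}$.

If $|C_G(g) \cap A^{-1}A| < \frac{|\Lambda_A||A|}{|AAA^{-1}|}$ for every $g \in \Upsilon$, then

$\sum_{g \in \Upsilon} \frac{|A|}{|C_G(g) \cap A^{-1}A|} > |\Upsilon| \cdot \frac{|A| \cdot |AAA^{-1}|}{|\Lambda_A||A|} = |AAA^{-1}|$,

and we reach a contradiction. □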
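
Proposition 4.1 can be checked exhaustively on a small group. The sketch below takes a subset $A$ of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$, computes the conjugacy classes meeting $A$ by brute force, and tests whether some $g \in A$ has a centralizer meeting $A^{-1}A$ in at least $|\Lambda_A||A|/|AAA^{-1}|$ elements. The 4-tuple encoding and the function names are illustrative assumptions.

```python
import itertools
import random

def mat_mul(x, y, p):
    """Multiply 2x2 matrices stored as tuples (a, b, c, d), entries mod p."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def mat_inv(x, p):
    """Inverse of a determinant-1 matrix mod p."""
    a, b, c, d = x
    return (d % p, -b % p, -c % p, a % p)

def sl2(p):
    """All elements of SL_2(Z/pZ), by exhaustive search."""
    return [m for m in itertools.product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]

def prop_4_1_holds(A, p):
    G = sl2(p)
    # Conjugacy classes of G that meet A (each class stored as a frozenset).
    classes = {frozenset(mat_mul(mat_mul(h, g, p), mat_inv(h, p), p) for h in G)
               for g in A}
    a_inv_a = {mat_mul(mat_inv(x, p), y, p) for x in A for y in A}
    aaa_inv = {mat_mul(mat_mul(x, y, p), mat_inv(z, p), p)
               for x in A for y in A for z in A}
    # Largest value of |C_G(g) ∩ A^{-1}A| over g in A.
    best = max(sum(1 for h in a_inv_a if mat_mul(g, h, p) == mat_mul(h, g, p))
               for g in A)
    # Cross-multiplied form of best >= |Lambda_A| |A| / |A A A^{-1}|.
    return best * len(aaa_inv) >= len(classes) * len(A)

if __name__ == "__main__":
    rng = random.Random(3)
    A = rng.sample(sl2(5), 10)
    print(prop_4_1_holds(A, 5))
```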

Lemma 4.2. Let $K$ be a field. Let $A$ be a finite subset of $\mathrm{SL}_2(K)$ not contained in any
proper subgroup of $\mathrm{SL}_2(K)$. Then $A_2$ has at least $\frac{1}{4}|A| - 1$ elements with trace other than
$\pm 2$.

Proof. Let $g \in A$ be an element of trace 2 or $-2$ other than $\pm I$. Let $B \subset A$ be the
set of all elements of $A$ with trace $\pm 2$ and an eigenvector in common with $g$. Suppose
$|B| \le \frac{1}{4}|A| + 3$. Let $h \in A \setminus B$. If $h$ has trace $\pm 2$, then either $gh$ or $g^{-1}h$ does not.
Therefore $A \cup A \cdot A \cup A^{-1}A$ has at least $\frac{1}{3}|A \setminus B| \ge \frac{1}{4}|A| - 1$ elements with trace other
than $\pm 2$. Suppose now $|B| > \frac{1}{4}|A| + 3$. Let $h$ be an element of $A$ that does not have an
eigenvector in common with $g$. Then there are at most two elements $g'$ of $B$ such that $g'h$
has trace 2. Hence $A \cdot A$ has more than $\frac{1}{4}|A| + 1$ elements with trace other than $\pm 2$. □

Corollary 4.3. Let $K$ be a field. Let $A$ be a non-empty finite subset of $\mathrm{SL}_2(K)$ not
contained in any proper subgroup of $\mathrm{SL}_2(K)$. Assume $|\mathrm{Tr}(A)| > 2$, $|A| > 4$. Then there
are at least

$\frac{(|\mathrm{Tr}(A)| - 2)(\frac{1}{4}|A| - 1)}{|A_6|}$

simultaneously diagonalizable matrices in $A_4$.

Proof. Let $B$ be the set of elements of $A_2$ with trace other than $\pm 2$. By Lemma 4.2,
$|B| \ge \frac{1}{4}|A| - 1$. We may apply Prop. 4.1 and obtain that there is a $g \in B$ such that

$|C_G(g) \cap (B^{-1}B)| \ge \frac{|\Lambda_B||B|}{|BBB^{-1}|} \ge \frac{(|\mathrm{Tr}(A)| - 2)(\frac{1}{4}|A| - 1)}{|A_6|}$.

All elements of $V = C_G(g) \cap (B^{-1}B)$ commute with $g$; since $\mathrm{Tr}(g) \ne \pm 2$, it follows that,
when $g$ is diagonalized, so is all of $V$. □

4.2. Escaping from subvarieties. The following lemma is based closely on [EMO,
Prop. 3.2].

Lemma 4.4. Let $G$ be a group. Consider a linear representation of $G$ on a vector space
$V$ over a field $K$. Let $W$ be a union $W_1 \cup W_2 \cup \ldots \cup W_n$ of proper subspaces of $V$.

Let $A$ be a subset of $G$; let $\mathcal{O}$ be an $\langle A \rangle$-orbit in $V$ not contained in $W$. Then there are
constants $\eta > 0$ and $m$ depending only on $n$ and $\dim V$ such that, for every $x \in \mathcal{O}$, there
are at least $\max(1, \eta|A|)$ elements $g \in A_m$ such that $gx \notin W$.

This may be phrased as follows: one can escape from $W$ by the action of the elements of
$A$. One can give stronger and more general statements of this kind; the spaces $W_n$ could
very well be taken to be varieties instead. However, what we have just stated will do.

Proof. Let us begin by showing that there are elements $g_1, \ldots, g_l \in A_r$ such that, for every
$x \in \mathcal{O}$, at least one of the $g_i \cdot x$'s is not in $W$. (Here $l$ and $r$ are bounded in terms of
$n$ and $d = \dim V$ alone.) We will proceed by induction on $(d_W, s_W)$, where $d_W$ is the
maximal dimension of the spaces $W_1, \ldots, W_n$ (i.e., $d_W = \max_{1 \le j \le n} \dim(W_j)$) and $s_W$ is
the number of spaces of dimension $d_W$ among $W_1, \ldots, W_n$. We shall always pass from $W$
to a union of the form $W' = W'_1 \cup \cdots \cup W'_{n'}$, where either (a) $d_{W'} < d_W$ or (b) $d_{W'} = d_W$
and $s_{W'} < s_W$. The base case of the inductive process will be $(d_W, s_W) = (0,0)$.

Let $W_+$ be the union of subspaces $W_j$, $1 \le j \le n$, of dimension $d_W$ (the maximal
dimension). If $W_+$ and $\mathcal{O}$ are disjoint, we set $W' = W \setminus W_+$. Suppose otherwise. Since $\mathcal{O}$
is not contained in $W_+$, we can find $x_0 \in W_+ \cap \mathcal{O}$, $g \in A \cup A^{-1}$ such that $gx_0 \notin W_+$. Hence
the set of subspaces of maximal dimension in $W$ is not the same as the set of subspaces of
maximal dimension in $gW$. It follows that $W' = gW \cap W$ does not contain $W_+$, and thus
has fewer subspaces $W'_j$ of dimension $d_W$ (the maximal dimension) than $W$ has.

We have thus passed from $W$ to $W'$, where either (a) $d_{W'} < d_W$ or (b) $d_{W'} = d_W$ and
$s_{W'} < s_W$. By the inductive hypothesis, we already know that there are $g'_1, \ldots, g'_{l'} \in A_{r'}$
such that, for every $x \in \mathcal{O}$, at least one of the $g'_i \cdot x$'s is not in $W'$. (Here $l'$ and $r'$ are
bounded in terms of $n'$ and $d = \dim V$ alone; the number $n'$ of subspaces $W'_1, W'_2, \ldots, W'_{n'}$
is bounded by $n^2$.) Since at least one of the $g'_i \cdot x$'s is not in $W' = gW \cap W$, either one of
the $g'_i \cdot x$'s is not in $W$ or one of the $g'_i \cdot x$'s is not in $gW$, i.e., one of the $g^{-1}g'_i \cdot x$'s is not
in $W$. Set

$g_1 = g'_1, \quad g_2 = g'_2, \quad \ldots, \quad g_{l'} = g'_{l'}$,
$g_{l'+1} = g^{-1}g'_1, \quad g_{l'+2} = g^{-1}g'_2, \quad \ldots, \quad g_{2l'} = g^{-1}g'_{l'}, \quad l = 2l'$.

(As can be seen, $g_i \in A_r$, where $r = r' + 1$.) We conclude that, for every $x \in \mathcal{O}$, at least
one of the $g_i \cdot x$'s is not in $W$.

The rest is easy: for each $x \in \mathcal{O}$ and each $g \in A$, at least one of the elements $g_i g \cdot x$,
$1 \le i \le l$ ($g_i \in A_r$) will not be in $W$. Each possible $g_i g$ can occur for at most $l$ different
elements $g \in A$; thus, there are at least $\max(1, |A|/l)$ elements $h = g_i g$ of $A_{r+1}$ such that
$hx \notin W$. □

Thanks are due to N. Anantharaman for pointing out an inaccuracy in a previous version of this paper,
and to both N. Anantharaman and E. Breuillard for help with the current phrasing.

We derive some immediate consequences. 

Corollary 4.5. Let $K$ be a field. Let $A$ be a finite subset of $\mathrm{SL}_2(K)$ not contained in any
proper subgroup of $\mathrm{SL}_2(K)$. If $|K| > 3$, the following holds: for any basis $\{v_1, v_2\}$ of $K^2$,
there is a $g \in A_k$ such that $gv_i \ne \lambda v_j$ for all choices of $\lambda \in K$, $i, j \in \{1, 2\}$, where $k$ is an
absolute constant.

Proof. Consider $G = \mathrm{SL}_2(K)$ and its natural action on the vector space $V = M_2(K)$ of
2-by-2 matrices. Let $W$ be the subset of $V$ consisting of all $h \in V$ such that $hv_i$ is a multiple of $v_j$ for
some $i, j \in \{1, 2\}$. Let $x$ be the identity in $M_2(K)$. Apply Lemma 4.4.

Before Lemma 4.4 can be applied, we must verify that the orbit $\mathcal{O} = \mathrm{SL}_2(K)$ of $x$ is not
contained in $W$. Let $G_{i,j}$ be the set of matrices $g$ in $\mathrm{SL}_2(K)$ such that $gv_i$ is a multiple of
$v_j$. Since $W(K) \cap G = G_{1,1} \cup G_{1,2} \cup G_{2,1} \cup G_{2,2}$, we would like to bound $|G_{i,j}|$. Let $g \in G_{i,j}$.
Choose a vector $v \in K^2$ (say $v = (1,0)$ or $v = (0,1)$) that is not a multiple of $v_i$. It is clear
that $gv$ and $gv_i$ determine $g$. At the same time, we already know that $gv_i = \lambda v_j$, and, if
$gv$ is fixed, two different values of $\lambda$ determine two matrices $g$ with different determinants;
in particular, at most one $\lambda \in K$ gives us a $g \in \mathrm{SL}_2(K)$. Thus $gv$ actually determines $g$.
Since $gv$ must be non-zero and lie in $K^2$, we conclude that $|G_{i,j}| \le |K|^2 - 1$.

The sets $G_{1,1}$ and $G_{2,2}$ intersect at the identity. Thus, $|W(K) \cap G| \le 4(|K|^2 - 1) - 1$.
Since $|\mathrm{SL}_2(K)| = |K| \cdot (|K|^2 - 1)$, it is enough to assume $|K| \ge 4$ to conclude that
$|W(K) \cap G| < |\mathrm{SL}_2(K)|$. In particular, for $|K| \ge 4$, the set $\mathcal{O} = \mathrm{SL}_2(K)$ is not contained
in $W$. We are entitled to apply Lemma 4.4 after all. □
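
The counting argument in the proof above can be confirmed by enumeration in a small case. The sketch below takes $K = \mathbb{F}_5$ and the standard basis $v_1 = (1,0)$, $v_2 = (0,1)$ (an illustrative choice; the corollary allows any basis) and counts the sets $G_{i,j}$ and their union. The function names and the 4-tuple encoding of matrices are assumptions.

```python
import itertools

def sl2(p):
    """All elements (a, b, c, d) of SL_2(Z/pZ), by exhaustive search."""
    return [m for m in itertools.product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1]

def g_ij(p, i, j):
    """Matrices g with g e_i a multiple of e_j, for the standard basis e_1, e_2."""
    out = set()
    for g in sl2(p):
        a, b, c, d = g
        column = (a, c) if i == 1 else (b, d)   # g e_i
        # g e_i is a multiple of e_j iff its other coordinate vanishes.
        other = column[1] if j == 1 else column[0]
        if other == 0:
            out.add(g)
    return out

if __name__ == "__main__":
    p = 5
    pieces = [g_ij(p, i, j) for i in (1, 2) for j in (1, 2)]
    union = set().union(*pieces)
    print([len(s) for s in pieces], len(union), len(sl2(p)))
```

In this example each $G_{i,j}$ has 20 elements, within the bound $|K|^2 - 1 = 24$, and the union has 72 elements, fewer than $|\mathrm{SL}_2(\mathbb{F}_5)| = 120$.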

Corollary 4.6. Let $K$ be a field. Let $A$ be a finite subset of $\mathrm{SL}_2(K)$ not contained in any
proper subgroup of $\mathrm{SL}_2(K)$. Then there are absolute constants $k, c > 0$ such that, given
any two non-zero vectors $v_1, v_2 \in K^2$,

$|A_k \setminus (H_{v_1} \cup H_{v_2})| > c|A|$,
where $H_v = \{g \in \mathrm{SL}_2(K) : v \text{ is an eigenvector of } g\}$.

Proof. Consider $G = \mathrm{SL}_2(K)$ and its natural action on $V = M_2(K)$. Let $W = H'_{v_1} \cup H'_{v_2}$,
where $H'_v = \{g \in M_2(K) : v \text{ is an eigenvector of } g\}$. Let $x = I$.

Before we apply Lemma 4.4, we need to check that $\mathrm{SL}_2(K)$ is not contained in $W(K)$.

Since the matrices (j }) , ( J fj and j) share no eigenvectors, there is no pair 

of eigenvectors $v_1$, $v_2$ such that each of the three matrices has at least one of $v_1$, $v_2$ as an
eigenvector. Thus $\mathrm{SL}_2(K) \not\subset W(K)$. Now apply Lemma 4.4. □

Lemma 4.2 could be derived from Lemma 4.4 as well, but, since the proof of Lemma
4.2 is simple as it is, we will not bother.



Thanks to O. Dinai for the counting argument about to be used.

4.3. Size from trace size. Given a large set $V$ of diagonal matrices and a matrix $g \notin V$
with only non-zero entries, one can multiply $V$ and $g$ to obtain at least $\gg |V|^3$ different
matrices.

Lemma 4.7. Let K be a field. Let V C SL 2 (K) be a finite set of simultaneously diago- 
nalizable matrices; call their common eigenvectors v\ and v 2 . Let g 6 S1j 2 (K) be such that 
gvi / Xvj for any A G K , i, j £ {1, 2}. TTjen 

\VgVg~ l V\ > ~ [\\V\ - 5 J |F| 2 . 



2 V4 

Proof. Diagonalize $V$, conjugating by an element of $\mathrm{SL}_2(\overline{K})$ if necessary. Write
$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. By assumption, $abcd \neq 0$. Then

\[ g \begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix} g^{-1} = \begin{pmatrix} r\,ad - r^{-1} bc & (r^{-1} - r)\,ab \\ (r - r^{-1})\,cd & r^{-1} ad - r\,bc \end{pmatrix}, \tag{4.1} \]

the product of whose upper-right and lower-left entries is $-(r - r^{-1})^2 abcd$. The map
$r \mapsto -(r - r^{-1})^2 abcd$ cannot send more than 4 distinct elements of $K^*$ to the same
element of $K$. Thus, the set $\{h_{12} h_{21} : h \in gVg^{-1}\}$ has cardinality at least $|V|/4$. The
upper-left and lower-right entries of the matrix on the right-hand side of (4.1) can both be
equal to 0 only if $r^2 - r^{-2} = 0$, and that can happen for at most 4 values of $r$. Let
$U = \{h \in gVg^{-1} : (h_{11} h_{12} h_{21} \neq 0) \wedge (h_{22} h_{12} h_{21} \neq 0)\}$; we have that
$|\{h_{12} h_{21} : h \in U\}| \geq \frac{1}{4}|V| - 5$.

Let $h \in U$ be fixed. Define

\[ f_h(s,t) = \begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix} \begin{pmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{pmatrix} \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix} = \begin{pmatrix} st\,h_{11} & st^{-1} h_{12} \\ s^{-1}t\,h_{21} & s^{-1}t^{-1} h_{22} \end{pmatrix}. \]

The product of the upper-right and lower-left entries of $f_h(s,t)$ is $h_{12} h_{21}$, which is
independent of $s$ and $t$. Since $h \in U$, we may recover $s^2$, $t^2$ and $st$ from $h$ and $f_h(s,t)$. Thus, for
$h$ fixed, there cannot be more than two pairs $(s,t)$ sharing the same value of $f_h(s,t)$. For
each element of $\{h_{12} h_{21} : h \in U\}$, choose an $h$ corresponding to it; let $s$ and $t$ vary. We
obtain at least $\frac{1}{2} |\{h_{12} h_{21} : h \in U\}| |V|^2$ different values of $f_h(s,t) \in VgVg^{-1}V$. We conclude
that $VgVg^{-1}V$ has cardinality at least $\frac{1}{2} |\{h_{12} h_{21} : h \in U\}| |V|^2 \geq \frac{1}{2} \left(\frac{1}{4}|V| - 5\right) |V|^2$. □
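
The bound of Lemma 4.7 can be sanity-checked by brute force over a small field. The following is only a sketch under stated assumptions: the prime `P = 31`, the choice of `V` as the full diagonal torus, and the matrix `g` are illustrative choices, not taken from the paper, and the lemma is read with the bound $\frac{1}{2}(\frac{1}{4}|V| - 5)|V|^2$.

```python
P = 31  # a small prime; K = Z/PZ (illustrative choice)

def mul(x, y):
    """Multiply two 2x2 matrices (a, b, c, d) modulo P."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def inv(x):
    """Inverse of a determinant-1 matrix: swap the diagonal, negate the rest."""
    a, b, c, d = x
    return (d % P, -b % P, -c % P, a % P)

# V: all diagonal matrices diag(r, r^{-1}) with r in (Z/PZ)^*.
V = [(r, 0, 0, pow(r, -1, P)) for r in range(1, P)]

# g has determinant 1 and all four entries non-zero (the hypothesis abcd != 0).
g = (1, 1, 1, 2)

conj = [mul(mul(g, v), inv(g)) for v in V]                  # g V g^{-1}
products = {mul(mul(s, h), t) for s in V for h in conj for t in V}

lower_bound = 0.5 * (len(V) / 4 - 5) * len(V) ** 2
print(len(products), lower_bound)
```

The set `products` is exactly $VgVg^{-1}V$ for this choice of $V$ and $g$, so comparing its size with `lower_bound` tests the inequality in one concrete instance; it proves nothing in general.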

We will now use Cor. 4.3, Cor. 4.5 and Lem. 4.7 to show that, unless $A$ grows substantially
under multiplication by itself, the cardinality of $A$ cannot be much smaller than
the cube of the cardinality of the set of traces $\mathrm{Tr}(A)$ of $A$.

Proposition 4.8. Let $K$ be a field. Let $A$ be a finite subset of $\mathrm{SL}_2(K)$ not contained in
any proper subgroup of $\mathrm{SL}_2(K)$. Assume $|\mathrm{Tr}(A)| > 2$, $|A| > 4$ and $|K| > 3$. Then

\[ |A_k| \geq \frac{1}{2} \left( \frac{1}{4} \cdot \frac{(|\mathrm{Tr}(A)| - 2)(\frac{1}{4}|A| - 1)}{|A_6|} - 5 \right) \left( \frac{(|\mathrm{Tr}(A)| - 2)(\frac{1}{4}|A| - 1)}{|A_6|} \right)^2, \]

where $k$ is an absolute constant.

Proof. By Cor. 4.3, there is a simultaneously diagonalizable subset $V \subset A_4$ with

\[ |V| \geq \frac{(|\mathrm{Tr}(A)| - 2)(\frac{1}{4}|A| - 1)}{|A_6|}; \]

call its common eigenvectors $v_1$ and $v_2$. Since $A$ is not contained in any
proper subgroup of $\mathrm{SL}_2(K)$, Cor. 4.5 yields a $g \in A_k$ such that $g v_i \neq \lambda v_j$ for all $\lambda \in \overline{K}$,
$i, j \in \{1, 2\}$. Hence, by Lemma 4.7, $|VgVg^{-1}V| \geq \frac{1}{2} \left(\frac{1}{4}|V| - 5\right) |V|^2$. □

We must now prove that, unless $A$ grows substantially when multiplied by itself, the
cardinality of $\mathrm{Tr}(A_k)$ cannot be much smaller than the cube root of the cardinality of $A$.
A preparatory lemma is needed. Like Lem. 4.7, it is of a very simple type: the cardinality
of a set is bounded from below by virtue of its containing the image of a map that
has a large enough domain and is not too far from being injective.

Lemma 4.9. Let $K$ be a field. Let $A$ be a finite subset of $\mathrm{SL}_2(K)$. Write the matrices in
$\mathrm{SL}_2(K)$ with respect to a basis $\{v_1, v_2\}$ of $K^2$. Suppose $g_{12} g_{21} \neq 0$ for every $g \in A$. Then

\[ |\mathrm{Tr}(A A^{-1})| \geq \frac{|A|}{2\,|\{(g_{11}, g_{22}) : g \in A\}|}. \]

Proof. Let $D = \{(g_{11}, g_{22}) : g \in A\}$. Consider any two distinct $g, g' \in A$ with $g_{11} = g'_{11}$,
$g_{22} = g'_{22}$. Then $g g'^{-1}$ has trace

\[ \mathrm{Tr}(g g'^{-1}) = g_{11} g_{22} + g_{22} g_{11} - g_{12} g'_{21} - g_{21} \frac{g_{11} g_{22} - 1}{g'_{21}}. \]

Thus, given $g \in A$, there can be at most two $g' \in A$ with $g_{11} = g'_{11}$, $g_{22} = g'_{22}$ such that
$\mathrm{Tr}(g g'^{-1})$ is equal to a given value. Choose $g$ such that $|\{g' \in A : g'_{11} = g_{11}, g'_{22} = g_{22}\}|$ is
maximal. That set has at least $|A|/|D|$ elements, whence the bound. □
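
Lemma 4.9 is easy to test numerically. The sketch below assumes $K = \mathbb{Z}/101\mathbb{Z}$ and builds a hypothetical set `A` whose elements share only four diagonal pairs, so that the lower bound $|A| / (2\,|\{(g_{11}, g_{22})\}|)$ is not trivial; none of these choices come from the paper.

```python
from itertools import product

P = 101  # illustrative prime; K = Z/PZ

def mul(x, y):
    """Multiply two 2x2 matrices (a, b, c, d) modulo P."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def inv(x):
    """Inverse of a determinant-1 matrix."""
    a, b, c, d = x
    return (d % P, -b % P, -c % P, a % P)

# A: matrices with diagonal (a, d) in a small set, lower-left entry c free and
# non-zero, and upper-right entry b forced by ad - bc = 1.
A = []
for a, d in product([2, 3], [5, 7]):
    for c in range(1, P):
        b = (a*d - 1) * pow(c, -1, P) % P   # non-zero because ad != 1 mod P here
        A.append((a, b, c, d))

traces = set()
for g in A:
    for h in A:
        m = mul(g, inv(h))
        traces.add((m[0] + m[3]) % P)

diag_pairs = {(g[0], g[3]) for g in A}
bound = len(A) / (2 * len(diag_pairs))
print(len(traces), bound)
```

Here `len(A)` is 400 and there are 4 diagonal pairs, so the lemma predicts at least 50 distinct traces in $AA^{-1}$.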

Proposition 4.10. Let $K$ be a field. Let $A$ be a finite subset of $\mathrm{SL}_2(K)$ not contained in
any proper subgroup of $\mathrm{SL}_2(K)$. Then

\[ |\mathrm{Tr}(A_k)| \geq c\,|A|^{1/3}, \]

where $k$ and $c > 0$ are absolute constants.

Proof. If $A$ has an element of trace other than $\pm 2$, let $h$ be one such element. Otherwise,
choose any $g_1 \in A$ other than $\pm I$, and any $g_2 \in A$ not in the unique Borel subgroup in
which $g_1$, being parabolic, lies; then either $g_1 g_2 \in A \cdot A$ or $g_1^{-1} g_2 \in A^{-1} A$ has trace $\neq \pm 2$;
choose $h \in A_2$, $\mathrm{tr}(h) \neq \pm 2$, to be one of the two. From now on, write all matrices with
respect to the two eigenvectors $v_1$, $v_2$ of $h$. We denote by $r$ and $r^{-1}$ the two eigenvalues
of $h$.

By Cor. 4.6, $|X| \geq c|A|$, where $X = A_{k_0} \setminus (H_{v_1} \cup H_{v_2})$ and $k_0$, $c > 0$ are absolute
constants. Lemma 4.9 now implies that

\[ |\mathrm{Tr}(A_{2k_0})| \geq |\mathrm{Tr}(X X^{-1})| \geq \frac{|X|}{2\,|\{(g_{11}, g_{22}) : g \in X\}|}. \tag{4.2} \]

For $t \in K$, let $D_t = \{(g_{11}, g_{22}) : g_{11} + g_{22} = t,\ g \in X\}$. Let $t \in K$ be such that $|D_t|$ is
maximal. For any $(a, d) \in D_t$, we have $ra + r^{-1}d = (r - r^{-1})a + r^{-1}t$. Thus, for any two
distinct pairs $(a, d), (a', d') \in D_t$, the two values $ra + r^{-1}d$, $ra' + r^{-1}d'$ must be distinct.
Thus

\[ |\mathrm{Tr}(A_{k_0+2})| \geq |\mathrm{Tr}(hX)| \geq |D_t| \geq \frac{|\{(g_{11}, g_{22}) : g \in X\}|}{|\mathrm{Tr}(X)|}. \]




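Multiplying by (4.2), we obtain

\[ |\mathrm{Tr}(A_{k_0+2})|\,|\mathrm{Tr}(A_{2k_0})| \geq \frac{|X|}{2\,|\mathrm{Tr}(X)|}, \]

and so $|\mathrm{Tr}(A_{2k_0})|^3 \geq |\mathrm{Tr}(A_{k_0+2})|\,|\mathrm{Tr}(A_{2k_0})|\,|\mathrm{Tr}(X)| \geq \frac{1}{2}|X|$, where we assume, as we may,
that $k_0 \geq 2$. Hence

\[ |\mathrm{Tr}(A_{2k_0})| \geq \left(\frac{1}{2}|X|\right)^{1/3} \geq \left(\frac{c}{2}\right)^{1/3} |A|^{1/3}. \]

□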

4.4. Growth of small sets. The statements in the section up to now reduce the main
problem to a question in $\mathbb{F}_{p^2}$, and that question can be answered using the results in §3.

Proof of part (a) of the key proposition. We may assume that $p$ is larger than an absolute
constant; otherwise we may make (1.4) true simply by adjusting the constant $c$ therein.
By the same token, we may assume that $|A|$ is larger than an absolute constant.

By Proposition 4.10, $|\mathrm{Tr}(A_{k_0})| \geq c_0 |A|^{1/3}$, where $k_0$ and $c_0$ are absolute constants. As
we said, we may assume that $|A| \geq \max((4/c_0)^3, 8)$. Thus, by Cor. 4.3, there are at least

\[ \frac{(c_0 |A|^{1/3} - 2)(\frac{1}{4}|A_{k_0}| - 1)}{|A_{6k_0}|} \geq \frac{c_0 |A|^{1/3} |A_{k_0}|}{16\,|A_{6k_0}|} \]

simultaneously diagonalizable matrices in $A_{4k_0}$; denote by $V$ the set of the eigenvalues of
$\left\lceil \frac{c_0 |A|^{1/3} |A_{k_0}|}{16\,|A_{6k_0}|} \right\rceil$ such matrices. Since we may assume that $c_0 < 1$, we have $|V| < |A|^{1/3} <
p^{1-\delta/3}$. We also take for granted that $|A_{6k_0}| < |A|^{7/6}$; otherwise, by Lem. 2.2, we are
already done. Thus $|V| \geq \frac{c_0}{16}|A|^{1/6}$, and so, given a $C$ depending only on $\delta$, we may
assume that $|V| > C$ by adjusting the constant $c$ in (1.4) accordingly.

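By Corollary 4.5, there is a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in A_{k_1}$ such that $abcd \neq 0$, where $k_1$ is an
absolute constant. Now, for any scalars $x$, $y$, the trace of

\[ \begin{pmatrix} x & 0 \\ 0 & x^{-1} \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} y & 0 \\ 0 & y^{-1} \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]

is $ad(xy + x^{-1}y^{-1}) - bc(x^{-1}y + xy^{-1})$. Letting $x$, $y$ range on all of $V_{20}$, we see that
$\mathrm{tr}(A_{160k_0+2k_1}) = \mathrm{tr}(A_{20 \cdot 4k_0 + k_1 + 20 \cdot 4k_0 + k_1}) \supset \{ad(xy + x^{-1}y^{-1}) - bc(x^{-1}y + xy^{-1}) : x, y \in
V_{20}\}$. Now we apply Prop. 3.3 with $q = p^2$, and obtain that

\[ |\mathrm{tr}(A_{160k_0+2k_1})| \geq |V|^{1+\epsilon}, \]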

where $\epsilon > 0$ depends only on $\delta$. Here we have assumed, as we may, that $|V| > C$, where
$C$ is the constant in the statement of Prop. 3.3 with $\delta$ equal to one-third of our $\delta$.
By the same argument as when we took $|V| \geq \frac{c_0}{16}|A|^{1/6}$, we may assume that

\[ \frac{1}{16} \cdot \frac{|\mathrm{Tr}(A_{160k_0+2k_1})|\,|A_{160k_0+2k_1}|}{|A_{6(160k_0+2k_1)}|} \geq 40. \]

(Otherwise we are already done.) We proceed by applying Prop. 4.8 and obtain

\[ |A_{k_2(160k_0+2k_1)}| \geq \frac{1}{2^{16}} \frac{|\mathrm{Tr}(A_{160k_0+2k_1})|^3\,|A_{160k_0+2k_1}|^3}{|A_{6(160k_0+2k_1)}|^3} \geq \frac{1}{2^{16}} \frac{|A_{160k_0+2k_1}|^3}{|A_{6(160k_0+2k_1)}|^3} |V|^{3(1+\epsilon)} \]

\[ \geq \frac{1}{2^{16}} \frac{|A_{160k_0+2k_1}|^3}{|A_{6(160k_0+2k_1)}|^3} \cdot \frac{c_0^3}{2^{12}} \frac{|A_{k_0}|^3}{|A_{6k_0}|^3} |A|^{1+\epsilon} \geq \frac{c_0^3}{2^{28}} \frac{|A|^6}{|A_{6(160k_0+2k_1)}|^6} |A|^{1+\epsilon}, \]

where $k_2$ is an absolute constant. Hence, either $|A_{6(160k_0+2k_1)}|$ or $|A_{k_2(160k_0+2k_1)}|$ must be
greater than $\frac{c_0^{3/7}}{16} |A|^{1+\epsilon/7}$. By Lemma 2.2, we are done. □
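
The trace identity used in the proof above, $\mathrm{tr}\left(\mathrm{diag}(x, x^{-1})\, g\, \mathrm{diag}(y, y^{-1})\, g^{-1}\right) = ad(xy + x^{-1}y^{-1}) - bc(x^{-1}y + xy^{-1})$, can be checked exhaustively over a small prime field. This is a minimal sketch; the prime `P` and the sample matrices are hypothetical test inputs, not objects from the paper.

```python
P = 29  # illustrative prime; scalars x, y range over (Z/PZ)^*

def mul(m, n):
    """Product of 2x2 matrices (a, b, c, d) modulo P."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def diag(x):
    """The matrix diag(x, x^{-1}) over Z/PZ."""
    return (x, 0, 0, pow(x, -1, P))

# A few determinant-1 matrices (a, b, c, d) with abcd != 0 (hypothetical inputs).
samples = [(1, 1, 1, 2), (3, 5, 4, 7), (2, 3, 5, 8)]

identity_holds = True
for a, b, c, d in samples:
    g = (a, b, c, d)
    g_inv = (d, -b % P, -c % P, a)
    for x in range(1, P):
        for y in range(1, P):
            m = mul(mul(diag(x), g), mul(diag(y), g_inv))
            lhs = (m[0] + m[3]) % P
            xi, yi = pow(x, -1, P), pow(y, -1, P)
            rhs = (a*d*(x*y + xi*yi) - b*c*(xi*y + x*yi)) % P
            identity_holds = identity_holds and (lhs == rhs)

print(identity_holds)
```

The check covers every pair $(x, y)$ of non-zero scalars modulo `P` for each sample matrix.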

5. Generating the whole group 

Since we have proved part (a) of the key proposition, we know how to attain a set of
cardinality $p^{3-\delta}$, $\delta > 0$, by multiplying a given set of generators $A$ by itself $(\log(p/|A|))^c$
times. It remains to show how to produce the group $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ in a bounded number of
steps from a set almost as large as $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ itself. As might be expected, instead of
the sum-product estimates for small sets (§2.5.1), we will use the estimates for large sets
(§2.5.2). We first focus on what happens in the Borel subgroups.

Lemma 5.1. Let $p$ be a prime. Let $H$ be a Borel subgroup of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Let $A \subset H$ be
given with $|A| > 2p^{5/3} + 1$. Then $A_8$ contains all elements of $H$ with trace 2.



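Proof. We may as well assume that $H$ is the set of upper-triangular matrices. Define
$P_r(A) = \left\{ x \in \mathbb{Z}/p\mathbb{Z} : \begin{pmatrix} r & x \\ 0 & r^{-1} \end{pmatrix} \in A \right\}$. By the pigeonhole principle, there is an $r \in
(\mathbb{Z}/p\mathbb{Z})^*$ such that $|P_r(A)| > 2p^{2/3}$. Let $\begin{pmatrix} t & u \\ 0 & t^{-1} \end{pmatrix}$ be any element of $A$ with $t \neq r$.
Then

\[ \begin{pmatrix} t & u \\ 0 & t^{-1} \end{pmatrix} \begin{pmatrix} r & x \\ 0 & r^{-1} \end{pmatrix} \begin{pmatrix} t^{-1} & -u \\ 0 & t \end{pmatrix} \begin{pmatrix} r^{-1} & -x' \\ 0 & r \end{pmatrix} \]

equals

\[ \begin{pmatrix} r & t^2 x + (r^{-1} - r)ut \\ 0 & r^{-1} \end{pmatrix} \begin{pmatrix} r^{-1} & -x' \\ 0 & r \end{pmatrix} = \begin{pmatrix} 1 & r(-x' + t^2 x) + (1 - r^2)ut \\ 0 & 1 \end{pmatrix}. \]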
Therefore, $P_1(AAA^{-1}A^{-1})$ is a superset of $r(-P_r(A) + t^2 P_r(A)) + (1 - r^2)ut$. Define
$S = \left\{ t \in (\mathbb{Z}/p\mathbb{Z})^*,\ t \neq r : \exists u \in \mathbb{Z}/p\mathbb{Z} \text{ s.t. } \begin{pmatrix} t & u \\ 0 & t^{-1} \end{pmatrix} \in A \right\}$. Clearly
$|S| \geq \frac{1}{p}(2p^{5/3} - p) > p^{2/3}$. By Lemma 2.3, there is a $t \in S$ such that

\[ |r(-P_r(A) + t^2 P_r(A)) + (1 - r^2)ut| = |P_r(A) - t^2 P_r(A)| > \frac{1}{2} p. \]

Thus,

\[ \left(r(-P_r(A) + t^2 P_r(A)) + (1 - r^2)ut\right) + \left(r(-P_r(A) + t^2 P_r(A)) + (1 - r^2)ut\right) = \mathbb{Z}/p\mathbb{Z}. \]

It follows that $AAA^{-1}A^{-1}AAA^{-1}A^{-1}$ contains all matrices $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$, $x \in \mathbb{Z}/p\mathbb{Z}$. □

Proof of part (b) of the key proposition. By part (a) of the key proposition, we may assume
that $|A| > 6p^{8/3} > (2p^{5/3} + 1)(p + 1)$. By the pigeonhole principle, there are at least
$2p^{5/3} + 1$ matrices in $A$ with the same lower row up to multiplication by a scalar in
$(\mathbb{Z}/p\mathbb{Z})^*$; the same holds, of course, for the upper row. Thus, there are at least $2p^{5/3} + 1$
upper-diagonal matrices and at least $2p^{5/3} + 1$ lower-diagonal matrices in $C = AA^{-1}$. By
Lemma 5.1, $C_8$ contains all matrices of the form $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 \\ y & 1 \end{pmatrix}$, $x, y \in \mathbb{Z}/p\mathbb{Z}$. Every
element of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ can be written in the form

\[ \begin{pmatrix} 1 & 0 \\ y & 1 \end{pmatrix} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ y' & 1 \end{pmatrix} \begin{pmatrix} 1 & x' \\ 0 & 1 \end{pmatrix}, \]

where $x, y, x', y' \in \mathbb{Z}/p\mathbb{Z}$. Hence $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}) = C_8 C_8 C_8 C_8 \subset A_{64}$. □
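
The decomposition used at the end of the proof, namely that every element of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ is a product of a lower, an upper, a lower and an upper unipotent matrix, can be confirmed by enumeration for a small prime. The sketch below takes `P = 7` as an illustrative choice and compares the number of distinct products with the group order $p(p^2 - 1)$.

```python
P = 7  # illustrative prime

def mul(m, n):
    """Product of 2x2 matrices (a, b, c, d) modulo P."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def lower(y):
    """Lower unipotent matrix with lower-left entry y."""
    return (1, 0, y, 1)

def upper(x):
    """Upper unipotent matrix with upper-right entry x."""
    return (1, x, 0, 1)

# All products lower(y) * upper(x) * lower(y2) * upper(x2).
products = {
    mul(mul(lower(y), upper(x)), mul(lower(y2), upper(x2)))
    for y in range(P) for x in range(P)
    for y2 in range(P) for x2 in range(P)
}

group_order = P * (P * P - 1)   # |SL_2(Z/PZ)|
print(len(products), group_order)
```

Every product has determinant 1, so the set of products equals the whole group exactly when the two printed numbers agree.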

Note added in proof. A far more elegant proof of part (b) given part (a) may be obtained
by an approach due to Gowers [Go2]; see [NP]. In brief: in the present context, it is cleaner
and simpler to do Fourier analysis on $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ itself, rather than to prove and use results
based on Fourier analysis over $\mathbb{Z}/p\mathbb{Z}$ (§2.5.2, §5).

6. The main theorem and further consequences 

Proof of Main Theorem. The statement of the theorem follows immediately from the key
proposition, parts (a) and (b), when $|A|$ is larger than an absolute constant. Since $|A \cup
A \cdot A| \geq |A| + 1$ for any $A$ not a subgroup of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$, we may increase the cardinality
of $A$ by an absolute constant $C$ simply by multiplying $A$ by itself $C$ times. □

Let $G$ be a finite group and $A \subset G$ a set of generators of $G$. Let $\psi$ be a probability
distribution on $G$ whose support contains $A$. We will assume throughout that $\psi$
is symmetric, i.e., $\psi(g) = \psi(g^{-1})$ for every $g \in G$. We define the transition matrix
$T_\psi(G, A) = \{\psi(y^{-1}x)\}_{x, y \in G}$. The largest eigenvalue of $T_\psi(G, A)$ is clearly 1.

Consider a family $\{G_j, A_j\}_{j \in J}$ of finite groups $G_j$ and sets of generators $A_j$ of $G_j$ such
that $d = |A_j \cup A_j^{-1}|$ is constant. Let $\psi_j(g) = \frac{1}{d}$ if $g \in A_j \cup A_j^{-1}$ and $\psi_j(g) = 0$ otherwise.
If the difference between the largest and the second largest eigenvalues of $T_{\psi_j}(G_j, A_j)$ is
bounded from below by a constant $\epsilon > 0$, then $\{\Gamma(G_j, A_j)\}_{j \in J}$ is a family of expander
graphs. Now let $\{(G_j, A_j)\}_{j \in J}$ be the family of all pairs $(G, A)$ with $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$, $p$
varying over all primes, and $A$ varying over all sets of generators of $G$ with $d = |A \cup A^{-1}|$
fixed. The question of whether this is a family of expander graphs may still be far from
being answered. We can prove a weaker property that has certain consequences of its own.

Corollary 6.1 (of the main theorem). Let $p$ be a prime. Let $A$ be a set of generators
of $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Let $\psi$ be a symmetric probability distribution on $G$ whose support
contains $A$; let $\eta = \min_{g \in A \cup A^{-1}} \psi(g)$. Then the second largest eigenvalue of $T_\psi(G, A)$ is
at most $1 - \frac{C \eta}{(\log p)^{2c}}$, where $c$ and $C > 0$ are absolute constants.

Here $c$ is the same as in the main theorem.

Proof. Immediate from the main theorem and the standard bound for the spectral gap in
terms of $\eta$ and the diameter (see, e.g., [DSC], Cor. 1). □

From now on, assume for notational convenience that $A = A^{-1}$, and choose the following
probability distribution on $G$:

\[ \psi(g) = \begin{cases} \frac{1}{2|A|} \delta_A(g) & \text{if $g$ is not the identity,} \\ \frac{1}{2|A|} \delta_A(g) + \frac{1}{2} & \text{if $g$ is the identity,} \end{cases} \tag{6.1} \]

where $\delta_A$ is the characteristic function of $A$. For every positive integer $n$ and every $g_0 \in G$,
let $\phi_{n, g_0}$ be the probability distribution on $G$ defined as a vector $\phi_{n, g_0} = (T_\psi(G, A))^n \delta_{g_0}$,
where the transition matrix $T_\psi(G, A)$ is as before and $\delta_{g_0}$ is the characteristic function of
$g_0$ seen as a vector of length $|G|$. We may regard $\phi_{n, g_0}$ as the outcome of a so-called lazy
random walk: start at a vertex $g_0$ of $\Gamma(G, A)$ and do the following $n$ times: throw a coin
into the air, take a random edge out of your current vertex if it is heads, but stay in place
if it is tails.

The mixing time $\mathrm{mix}_{G,A}$ of the lazy random walk on $\Gamma(G, A)$ is defined to be the smallest
positive integer $n$ such that

\[ \sum_{g \in G} \left| \phi_{n, g_0}(g) - \frac{1}{|G|} \right| \leq \frac{1}{2}. \tag{6.2} \]

It is clear that $\mathrm{mix}_{G,A}$ is independent of $g_0$. The constant $\frac{1}{2}$ in (6.2) is conventional; if it
were changed to 1/1000000, the mixing time would change by at most a constant factor.
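
The definition can be made concrete by simulating the lazy random walk on a small group. The sketch below assumes $p = 5$, takes the two elementary matrices and their inverses as a hypothetical symmetric generating set, uses the distribution $\psi$ of (6.1), and reads (6.2) as the condition that the $\ell^1$ distance to the uniform distribution be at most $\frac{1}{2}$; the function names are illustrative.

```python
P = 5  # illustrative prime

def mul(x, y):
    """Multiply two 2x2 matrices (a, b, c, d) modulo P."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def inv(x):
    """Inverse of a determinant-1 matrix."""
    a, b, c, d = x
    return (d % P, -b % P, -c % P, a % P)

identity = (1, 0, 0, 1)
gens = [(1, 1, 0, 1), (1, 0, 1, 1)]        # assumed generators, not from the paper
A = set(gens) | {inv(s) for s in gens}      # symmetric: A = A^{-1}

# Enumerate G = SL_2(Z/PZ).
G = [(a, b, c, d) for a in range(P) for b in range(P)
     for c in range(P) for d in range(P) if (a*d - b*c) % P == 1]

def step(phi):
    """One lazy step: stay put with probability 1/2, else multiply by a uniform element of A."""
    new = {g: 0.5 * phi[g] for g in G}
    weight = 1.0 / (2 * len(A))
    for y, mass in phi.items():
        if mass:
            for s in A:
                new[mul(y, s)] += weight * mass
    return new

def l1_distance(phi):
    """Sum over g of |phi(g) - 1/|G||."""
    return sum(abs(phi[g] - 1.0 / len(G)) for g in G)

def mixing_time(start, max_steps=10_000):
    """Smallest n with l1 distance to uniform at most 1/2, starting from `start`."""
    phi = {g: 0.0 for g in G}
    phi[start] = 1.0
    for n in range(1, max_steps + 1):
        phi = step(phi)
        if l1_distance(phi) <= 0.5:
            return n
    raise RuntimeError("walk did not mix within max_steps")

n_mix = mixing_time(identity)
print(n_mix)
```

Calling `mixing_time` from another starting vertex, for instance `mixing_time(gens[0])`, should return the same value, in line with the remark that the mixing time does not depend on $g_0$.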

Corollary 6.2 (of Corollary 6.1). Let $p$ be a prime. Let $A$ be a set of generators of
$G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Then the mixing time $\mathrm{mix}_{G,A}$ is $O(|A| (\log p)^{2c+1})$, where $c$ and the
implied constant are absolute.

Again, the constant $c$ is as in the main theorem.

Proof. Immediate from Corollary 6.1 via [DSC], Lemma 2. (For $\psi$ as in (6.1), the transition
matrix $T_\psi(G, A)$ has no negative eigenvalues; see [DSC], Lemma 1.) □



By a word on the symbols $x_1, x_2, \ldots, x_n$ we mean, as is usual, a product of finitely many
copies of $x_1, x_1^{-1}, x_2, x_2^{-1}, \ldots, x_n, x_n^{-1}$. A trivial word is a product of finitely many terms of
the form $g g^{-1}$, where $g$ is any word.

Corollary 6.3 (of the key proposition, part (b)). Let $A$ be a set of generators of a free
subgroup of $\mathrm{SL}_2(\mathbb{Z})$. Let $p$ be any prime for which the reduction $\bar{A} \subset \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ of $A$
modulo $p$ generates $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Then the diameter of the Cayley graph
$\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), \bar{A})$ is $O_A(\log p)$, where the implied constant depends only on $A$.

We may take, for example, $A$ as in (1.2) or (1.3), with $p > 5$.

Proof. Let $g_1, g_2, \ldots, g_n \in \mathrm{SL}_2(\mathbb{Z})$ be the elements of $A$. Let $w(x_1, x_2, \ldots, x_n)$ be a
non-trivial word on $x_1, x_2, \ldots, x_n$. Since $A$ generates a free group, $w(g_1, g_2, \ldots, g_n) \neq I$.
Suppose that $w(\bar{g}_1, \bar{g}_2, \ldots, \bar{g}_n)$ equals the identity in $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$, where $\bar{g}_1, \ldots, \bar{g}_n$ are the
reductions mod $p$ of $g_1, \ldots, g_n$. Then at least one of the entries of $w(g_1, g_2, \ldots, g_n)$ must
have absolute value at least $p - 1$. Yet it is clear that this is impossible if $w$ is of length
$< k \log p$, where $k > 0$ is a constant depending only on $A$. (Cf. [Ma].)

We thus have that any two distinct products of length at most $\frac{k}{2} \log p$ on the symbols
$x_1, \ldots, x_n$ must take distinct values in $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ for $x_1 = \bar{g}_1, \ldots, x_n = \bar{g}_n$. We obtain
that $|\bar{A}_{\lfloor \frac{k}{2} \log p \rfloor}| \geq n^{\lfloor \frac{k}{2} \log p \rfloor}$. For all $p$ larger than an absolute constant, we have $n^{\lfloor \frac{k}{2} \log p \rfloor} >
p^{\epsilon}$, where $\epsilon > 0$ depends only on $k$, and hence only on $A$. We apply part (b) of the
key proposition to $\bar{A}_{\lfloor \frac{k}{2} \log p \rfloor}$, and conclude that $\mathrm{diam}(\Gamma(\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}), \bar{A})) \leq C \log p$ for some
constant $C$ depending only on $A$. □
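
For small primes the diameter in Corollary 6.3 can be computed directly by breadth-first search. The following sketch uses the two elementary matrices as a hypothetical generating set (the sets (1.2) and (1.3) of the paper are not reproduced here) and the edge convention $(ag, g)$ from the definition of the Cayley graph; since the graph is vertex-transitive, the eccentricity of the identity equals the diameter.

```python
from collections import deque

def cayley_diameter(p, gens):
    """Return (diameter, number of vertices reached) for the Cayley graph of
    SL_2(Z/pZ) with respect to gens and their inverses, via BFS from the identity."""
    def mul(x, y):
        a, b, c, d = x
        e, f, g, h = y
        return ((a*e + b*g) % p, (a*f + b*h) % p,
                (c*e + d*g) % p, (c*f + d*h) % p)

    def inv(x):
        a, b, c, d = x
        return (d % p, -b % p, -c % p, a % p)

    steps = list(gens) + [inv(s) for s in gens]
    start = (1, 0, 0, 1)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for s in steps:
            w = mul(s, v)            # edge (s*v, v), as in the definition
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values()), len(dist)

gens = [(1, 1, 0, 1), (1, 0, 1, 1)]   # assumed generators, for illustration only
results = {p: cayley_diameter(p, gens) for p in (5, 7, 11, 13)}
for p, (diam, reached) in results.items():
    print(p, diam, reached)
```

The second number in each row should equal $p(p^2 - 1)$, confirming that the chosen matrices generate the whole group; the first number is the diameter, which the corollary bounds by a constant times $\log p$.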

The following lemma seems to be folkloric. A more general statement was proved in
unpublished work by A. Shalev [Lu2]. Similar results have been discovered independently
by others; in particular, a generalization will appear in a paper by Gamburd et al. [Ga2].
We give a proof for the sake of completeness.

Lemma 6.4. Let $p$ be a prime. Let $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Let $\mathcal{C}_p$ be the set of all pairs
$(g, h) \in G^2$ such that $g$ and $h$ generate $G$. There is an absolute constant $c > 0$ such that
$\Gamma(G, \{g, h\})$ has loops of length $\leq c \log p$ for at most $o(|\mathcal{C}_p|)$ pairs $(g, h) \in \mathcal{C}_p$, where the
rate of convergence to 0 of $o(|\mathcal{C}_p|)/|\mathcal{C}_p|$ is absolute.

Proof. Let $w(g, h)$ be a non-trivial word. Let $f_{12}, f_{21} \in \mathbb{Z}[x_1, x_2, \ldots, x_8]$ be the upper-right
and lower-left entries of the matrix obtained by formally replacing all occurrences of $g$, $h$,
$g^{-1}$, $h^{-1}$ in $w(g, h)$ by the matrices

\[ \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix}, \quad \begin{pmatrix} x_5 & x_6 \\ x_7 & x_8 \end{pmatrix}, \quad \begin{pmatrix} x_4 & -x_2 \\ -x_3 & x_1 \end{pmatrix}, \quad \begin{pmatrix} x_8 & -x_6 \\ -x_7 & x_5 \end{pmatrix}, \]

respectively. Either $f_{12}$ or $f_{21}$ is not identically equal to zero: let $A$ be as in (1.2), and
denote its elements by $X$ and $Y$; since $X$ and $Y$ generate a free subgroup of $\mathrm{SL}_2(\mathbb{Z})$, at
least one of the upper-right and lower-left entries of $w(X, Y)$ or $w(Y, X)$ must be non-zero.
(We cannot have $w(X, Y) = -I = w(Y, X)$, and neither $w(X, Y) = I$ nor $w(Y, X) = I$ is
possible.)

Assume henceforth that the length $\ell$ of $w$ is at most $\log_2(p - 2)$. The coefficients of $f_{12}$ and
$f_{21}$ are bounded above in absolute value by $2^{\ell} \leq p - 2$. Hence at least one of the reductions
$\bar{f}_{12}, \bar{f}_{21} \in (\mathbb{Z}/p\mathbb{Z})[x_1, x_2, \ldots, x_8]$ is non-zero. Choose one of the non-zero reductions and
call it $P$.

Since $P$ is a non-zero polynomial of degree at most $\ell$, there are at most $8 \ell p^7$ tuples
$(x_1, \ldots, x_8) \in (\mathbb{Z}/p\mathbb{Z})^8$ such that $P(x_1, \ldots, x_8) = 0$. (While this follows immediately from
the Lang-Weil estimates, it is also quite easy to give an elementary proof. For every tuple
$(x_2, \ldots, x_8) \in (\mathbb{Z}/p\mathbb{Z})^7$, either there are no more than $\ell$ values of $x_1$ with $P(x_1, \ldots, x_8) = 0$,
or $P_{(1)}(x_2, \ldots, x_8) = 0$, where $P_{(1)}$ is the leading coefficient of $P$ considered as a polynomial
on $x_1$. If $P_{(1)}(x_2, \ldots, x_8) = 0$, repeat the argument with $P_{(1)}$ instead of $P$ and $(x_2, \ldots, x_8)$
instead of $(x_1, \ldots, x_8)$.) Take any $g, h \in \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ such that $w(g, h) = I$. Then, for
all $c_1, c_2 \in (\mathbb{Z}/p\mathbb{Z})^*$, both the upper-right and lower-left entries of $w(c_1 g, c_2 h)$ are 0.
Moreover, each pair $c_1 g, c_2 h \in M_2(\mathbb{Z}/p\mathbb{Z})$ can arise from at most four different pairs
$g, h \in \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Since every pair $c_1 g$, $c_2 h$ gives a distinct solution to $P(x_1, \ldots, x_8) = 0$,
there are at most $32 \ell p^5$ pairs $g, h \in \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ such that $w(g, h) = I$.

There are at most $4^{\ell} + 4^{\ell - 1} + \cdots + 1 < 4^{\ell + 1}$ distinct words $w$ on $g$ and $h$ of length
at most $\ell$. We conclude that, for every $\ell \leq \log_2(p - 2)$, there are fewer than $32 \ell 4^{\ell + 1} p^5$ pairs
$g, h \in \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ such that $w(g, h) = I$ for some non-trivial word $w$ of length at most $\ell$.

Set $\ell = \frac{\log p}{2 \log 4}$. Our aim is to show that $32 \ell 4^{\ell + 1} p^5 \ll p^{5.5} \log p$ is small compared to $|\mathcal{C}_p|$; it
will suffice to show that few of the $((p^2 - 1)p)^2$ pairs $(g, h) \in (\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}))^2$ are not in
$\mathcal{C}_p$.

Every proper subgroup of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ is contained in at least one of (a) $O(p)$ subgroups
of $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$ of order $O(p^2)$, (b) $O(p^2)$ subgroups of order $O(p)$, or (c) $O(p^3)$ subgroups of
order $O(1)$, where the implied constants are absolute. Tautologically, a pair of elements
of a group $G$ fail to generate $G$ if and only if they are both contained in some proper
subgroup of $G$. Hence there are at most $O(p^5)$ pairs $(g, h) \in (\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z}))^2$ not in
$\mathcal{C}_p$.

We conclude that there are at most $O(|\mathcal{C}_p| (\log p)/p^{1/2})$ pairs $(g, h) \in \mathcal{C}_p$ for which the
graph $\Gamma(G, \{g, h\})$ has loops of length $\leq \frac{\log p}{2 \log 4}$. (A trivial change in the argument would
give the bound $O_{\epsilon}(|\mathcal{C}_p| (\log p)/p^{1 - \epsilon})$ for $\epsilon > 0$ arbitrary.) □
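
The elementary zero-counting bound used in the proof above (a non-zero polynomial of degree at most $\ell$ in $n$ variables over $\mathbb{Z}/p\mathbb{Z}$ has at most $n \ell p^{n-1}$ zeros) can be illustrated in two variables. The polynomial and the prime below are hypothetical examples, chosen only so that the bound is smaller than the total number of points.

```python
P = 31  # illustrative prime

def poly(x, y):
    """A non-zero polynomial of total degree 3 over Z/PZ (hypothetical example)."""
    return (x * x * y - y ** 3 + 1) % P

n_vars, degree = 2, 3

# Count the zeros of poly on (Z/PZ)^2 by exhaustive search.
zeros = sum(1 for x in range(P) for y in range(P) if poly(x, y) == 0)

bound = n_vars * degree * P ** (n_vars - 1)   # n * l * p^(n-1)
print(zeros, bound, P ** n_vars)
```

In the proof the same count is applied with $n = 8$, giving the bound $8 \ell p^7$.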

We can now answer in the affirmative a question of Lubotzky's ([Lu], Prob. 10.3.3).

Corollary 6.5 (of the key proposition, part (b)). Let $p$ be a prime. Let $G = \mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$.
Let $\mathcal{C}_p$ be the set of all pairs $(g, h) \in G^2$ such that $g$ and $h$ generate $G$. There is an
absolute constant $C > 0$ such that $\mathrm{diam}(\Gamma(G, \{g, h\})) \leq C \log p$ for all pairs $(g, h) \in \mathcal{C}_p$
outside a subset of $\mathcal{C}_p$ of cardinality $o(|\mathcal{C}_p|)$, where the rate of convergence to 0 of $o(|\mathcal{C}_p|)/|\mathcal{C}_p|$
is absolute.

Proof. By Lemma 6.4, all pairs $(g, h) \in \mathcal{C}_p$ outside a subset of $\mathcal{C}_p$ of cardinality $o(|\mathcal{C}_p|)$
yield graphs $\Gamma(G, \{g, h\})$ without loops of length $\leq c \log p$, where $c > 0$ is absolute. Let
$(g, h)$ be any such pair. Then $|\{g, h\}_{\lfloor \frac{c}{2} \log p \rfloor}| \geq 2^{\lfloor \frac{c}{2} \log p \rfloor} \gg p^{\frac{c}{2} \log 2}$. (Cf. the proof of Cor.
6.3.) We apply part (b) of the key proposition to $A = \{g, h\}_{\lfloor \frac{c}{2} \log p \rfloor}$ and are done. □

In Corollaries 6.3 and 6.5, only the second part of the key proposition was directly
invoked. Of course, the proof of part (b) of the key proposition does use part (a), but only
with $|A| > p^{\delta}$, where $\delta > 0$ is fixed. This means in turn that the sum-product estimate
(Theorem 2.4) is used only for subsets of $\mathbb{F}_p^*$ whose cardinality is greater than $p^{\epsilon}$, where
$\epsilon > 0$ is fixed. Thus, the results in [Ko] are not used. Since the sum-product estimates
in [BKT] are purely combinatorial, the proofs of Cor. 6.3 and 6.5 are ultimately free of
arithmetic.

Note added in proof. (a) Bourgain and Gamburd have recently derived results much
stronger than Corollaries 6.3 and 6.5 from the key proposition of the present paper; see
[BG]. (b) There is now a proof ([TV], §2.8) of the sum-product theorem that does not
involve Stepanov's method even for subsets of $\mathbb{F}_p^*$ of cardinality smaller than $p^{\epsilon}$. Thus, all
that is not additive combinatorics has disappeared from what is employed in this paper.